A Hidden Semi-Markov Model-Based Speech Synthesis System
Abstract
Recently, a statistical speech synthesis system based on the hidden Markov model (HMM) has been proposed. In this system, the spectrum, excitation, and duration of human speech are modeled simultaneously by context-dependent HMMs, and speech parameter vector sequences are generated from the HMMs themselves. This system defines the speech synthesis problem in a generative model framework and solves it using the maximum likelihood (ML) criterion. However, there is an inconsistency: although state duration models are explicitly used in the synthesis part of the system, they have not been incorporated in its training part. This inconsistency may degrade the naturalness of synthesized speech. In the present paper, a statistical speech synthesis system based on a hidden semi-Markov model (HSMM), which can be viewed as an HMM with explicit state duration models, is developed and evaluated. The use of HSMMs allows the state duration models to be incorporated explicitly not only in the synthesis part but also in the training part of the system, which resolves the inconsistency in the HMM-based speech synthesis system. Subjective listening test results show that the use of HSMMs improves the reported naturalness of synthesized speech.
Key words: hidden Markov model, hidden semi-Markov model, HMM-based speech synthesis
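The difference between an HMM's implicit state duration and an HSMM's explicit state duration model can be illustrated with a small sketch. This is not the authors' implementation: the function names, the self-loop probability 0.8, and the Gaussian mean and variance below are illustrative assumptions. A standard HMM state with a fixed self-transition probability implies a geometric duration distribution, whose most likely duration is always one frame. An explicit duration model (here a single Gaussian restricted to positive integers and renormalised) can place its peak at a realistic duration.

```python
import math

def hmm_duration_pmf(d, self_loop_prob):
    """Implicit HMM duration: stay in the state d-1 times, then leave (geometric)."""
    return self_loop_prob ** (d - 1) * (1.0 - self_loop_prob)

def hsmm_duration_pmf(d, mean, var, max_d=200):
    """Explicit duration model (assumed form): a Gaussian evaluated on the
    integers 1..max_d and renormalised so the probabilities sum to one."""
    def unnormalised(x):
        return math.exp(-0.5 * (x - mean) ** 2 / var)
    norm = sum(unnormalised(k) for k in range(1, max_d + 1))
    return unnormalised(d) / norm

def most_likely_duration(pmf, max_d=200):
    """Return the duration in 1..max_d with the highest probability."""
    return max(range(1, max_d + 1), key=pmf)

# Illustrative values only: both models have an expected duration near 5 frames.
hmm_mode = most_likely_duration(lambda d: hmm_duration_pmf(d, 0.8))
hsmm_mode = most_likely_duration(lambda d: hsmm_duration_pmf(d, 5.0, 2.0))
print("HMM most likely duration:", hmm_mode)
print("HSMM most likely duration:", hsmm_mode)
```

With these assumed values the geometric model has mean duration 1 / (1 - 0.8) = 5 frames yet still ranks a one-frame stay as most probable, while the explicit model peaks at its mean.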
Related articles
An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005
In the present paper, the hidden Markov model (HMM) based speech synthesis system developed at Nagoya Institute of Technology (Nitech-HTS) for Blizzard Challenge 2005, a competition of text-to-speech synthesis systems using the same speech databases, is described. We show an overview of the basic HMM-based speech synthesis system and then recent developments to the latest one such as STRAIGHT...
MLLR adaptation for hidden semi-Markov model based speech synthesis
This paper describes an extension of maximum likelihood linear regression (MLLR) to hidden semi-Markov model (HSMM) and presents an adaptation technique of phoneme/state duration for an HMM-based speech synthesis system using HSMMs. The HSMM-based MLLR technique can realize the simultaneous adaptation of output distributions and state duration distributions. We focus on describing mathematical ...
Hidden semi-Markov model based speech synthesis
In the present paper, a hidden semi-Markov model (HSMM) based speech synthesis system is proposed. In a hidden Markov model (HMM) based speech synthesis system which we have proposed, rhythm and tempo are controlled by state duration probability distributions modeled by single Gaussian distributions. To synthesize speech, it constructs a sentence HMM corresponding to an arbitrarily given text an...
A Bayesian approach to Hidden Semi-Markov Model based speech synthesis
This paper proposes a Bayesian approach to hidden semi-Markov model (HSMM) based speech synthesis. Recently, hidden Markov model (HMM) based speech synthesis based on the Bayesian approach was proposed. The Bayesian approach is a statistical technique for estimating reliable predictive distributions by treating model parameters as random variables. In the Bayesian approach, all processes for con...
Journal title:
- IEICE Transactions
Volume 90-D
Publication date: 2007